Comparison between Expert Listeners and Continuous Speech Recognizers in Selecting Pronunciation Variants

Authors

Mirjam Wester

Judith M. Kessens

Catia Cucchiarini

Helmer Strik

Abstract

In this paper, the performance of an automatic transcription tool is evaluated. The transcription tool is a continuous speech recognizer (CSR) which can be used to select pronunciation variants (i.e. detect insertions and deletions of phones). The performance of the CSR was compared to a reference transcription based on the judgments of expert listeners. We investigated to what extent the degree of agreement between the listeners and the CSR was affected by employing various sets of phone models (PMs). Overall, the PMs perform more similarly to the listeners when pronunciation variation is modeled. However, the various sets of PMs lead to different results for insertion and deletion processes. Furthermore, we found that to a certain degree, word error rates can be used to predict which set of PMs to use in the transcription tool.

... corpus is by modeling pronunciation variation [2]. Another way of obtaining models which are less contaminated is to train PMs on read speech. It is well known that the extent of variation in spontaneous speech is larger than in read speech. So, for read speech there will be fewer mismatches between the speech signal and the transcriptions. Thus, it is to be expected that PMs which are trained on read speech will be less contaminated than those trained on spontaneous speech. One can imagine that PMs with varying degrees of contamination may cause the CSR to select different pronunciation variants. As a consequence, the degree of agreement between the CSR and the reference transcription may vary as a function of the PMs employed. The purpose of the present study is to investigate to what extent the degree of ...
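
The comparison described here, a CSR's decisions set against a reference transcription, can be illustrated with a minimal sketch. It assumes each decision is binary (1 = phone present, 0 = phone absent) and uses raw percent agreement plus Cohen's kappa as agreement measures. The text above does not say which measure the authors used, so kappa is an assumption, and the two decision lists are made-up toy data, not results from the paper.

```python
def percent_agreement(a, b):
    """Fraction of cases in which the two transcriptions make the same decision."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Chance-corrected agreement for binary decisions (assumed measure)."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    # Expected chance agreement from each rater's marginal rate of "present".
    rate_a = sum(a) / n
    rate_b = sum(b) / n
    p_chance = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    if p_chance == 1.0:
        # Both raters always give the same single label; kappa is undefined,
        # so report perfect agreement by convention in this sketch.
        return 1.0
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical toy data: 1 = phone judged present, 0 = phone judged absent.
reference = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]  # stands in for the listener-based reference
csr = [1, 0, 0, 1, 0, 1, 1, 1, 0, 1]        # stands in for one set of phone models

if __name__ == "__main__":
    print(f"percent agreement: {percent_agreement(reference, csr):.2f}")
    print(f"Cohen's kappa:     {cohens_kappa(reference, csr):.2f}")
```

Running the same two functions on the output of each set of PMs would give one agreement score per set, which is the kind of comparison the abstract describes.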

Similar resources

The selection of pronunciation variants: comparing the performance of man and machine

In this paper the performance of an automatic transcription tool is evaluated. The transcription tool is a Continuous Speech Recognizer (CSR) running in forced recognition mode. For evaluation the performance of the CSR was compared to that of nine expert listeners. Both man and the machine carried out exactly the same task: deciding whether a segment was present or not in 467 cases. It turned ...

Automatic Generation of Pronunciation Dictionaries

In this report we will describe a data-driven approach for creating pronunciation dictionaries for a new unseen target language by voting among phoneme recognizers in nine different languages other than the target language. In this process recordings of the new language that are transcribed on word level are decoded by the phoneme recognizers. This results in a hypothesis of nine phonemes per t...

The roles of reconstruction and lexical storage in the comprehension of regular pronunciation variants

This paper investigates how listeners process regular pronunciation variants, resulting from simple general reduction processes. Study 1 shows that when listeners are presented with new words, they store the pronunciation variants presented to them, whether these are unreduced or reduced. Listeners thus store information on word-specific pronunciation variation. Study 2 suggests that if partici...

Accent and television journalism: evidence for the practice of speech language pathologists and audiologists.

PURPOSE: To analyze the preferences and attitudes of listeners in relation to regional (RA) and softened accents (SA) in television journalism. METHODS: Three television news presenters recorded carrier phrases and a standard text using RA and SA. The recordings were presented to 105 judges who listened to the word pairs and answered whether they perceived differences between the RA and SA, and...

Automatic text-independent pronunciation scoring of foreign language student speech

SRI International is currently involved in the development of a new generation of software systems for automatic scoring of pronunciation as part of the Voice Interactive Language Training System (VILTS) project. This paper describes the goals of the VILTS system, the speech corpus, and the algorithm development. The automatic grading system uses SRI’s Decipher™ continuous speech recognition s...

Publication date: 1999
